arXiv:1507.07844v2 [cs.SY] 9 Aug 2015 


Composite Learning Control With Application to Inverted Pendulums


Yongping Pan*, Lin Pan†‡, and Haoyong Yu*

*School of Biomedical Engineering, National University of Singapore, Singapore 117575, Singapore
Email: biepany@nus.edu.sg; bieyhy@nus.edu.sg
†Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg
Email: lin.pan@uni.lu
‡School of Electric and Electronic Engineering, Wuhan Polytechnic University, Wuhan 430023, China


Abstract —Composite adaptive control (CAC) that integrates 
direct and indirect adaptive control techniques can achieve
smaller tracking errors and faster parameter convergence
compared with either direct or indirect adaptive control alone.
However, the condition of persistent excitation (PE) still has to be
satisfied to guarantee parameter convergence in CAC. This paper
proposes a novel model reference composite learning control
(MRCLC) strategy for a class of affine nonlinear systems with
parametric uncertainties to guarantee parameter convergence
without the PE condition. In the composite learning, an integral
over a moving time window is utilized to construct a prediction
error, a linear filter is applied to avoid the need for time
derivatives of plant states, and both the tracking error and the
prediction error are applied to update parametric estimates. It is
proven that the closed-loop system achieves global exponential-like
stability under interval excitation rather than PE of regression
functions. The effectiveness of the proposed MRCLC has been verified
by application to an inverted pendulum control problem.

I. Introduction 

Adaptive control is one of the major control techniques
for handling uncertainties in nonlinear systems and has continued
to attract great attention in recent years [?]. In particular,
model reference adaptive control (MRAC) is a popular
adaptive control architecture which aims to make an uncertain
dynamical system behave like a chosen reference model. The
way parameters are estimated in adaptive control gives rise to
two different schemes, namely indirect and direct schemes
[?]. Composite adaptive control (CAC) is an integrated direct
and indirect adaptive control technique which aims to achieve
better tracking and parameter estimation through faster and
smoother parameter adaptation [?]. In CAC, prediction
errors are generated by identification models, and both tracking
errors and prediction errors are applied to update parametric
estimates. The superiority of CAC in performance improvement
has been demonstrated in many control designs; for some
recent results, see [?]. Nevertheless, as in
classical adaptive control, CAC only achieves asymptotic
convergence of tracking errors and does not guarantee parameter
convergence unless plant states satisfy the condition of
persistent excitation (PE) [?]. It is well known that the PE
condition is very strict and often infeasible to monitor online
in practical control systems [?].

Learning is one of the fundamental features of autonomous
intelligent behavior, and it is closely related to parameter
convergence in adaptive control [?]. The benefits brought by
parameter convergence include accurate closed-loop identification,
exponential stability and robustness against measurement
noise [28]. An emerging concurrent learning technique provides
a feasible and promising way of achieving parameter
convergence without the PE condition in MRAC [?].
The difference between concurrent learning and composite
adaptation lies in how prediction errors are constructed.
In concurrent learning, a dynamic data stack constituted
by online recorded data is used in constructing prediction
errors, singular value maximization is applied to maximize the
minimal singular value of the data stack, and exponential convergence
of both tracking errors and estimation errors is guaranteed if
regression functions of plant uncertainties are excited over a
time interval such that sufficiently rich data are recorded in
the data stack. Yet in this innovative design, the singular value
maximization leads to an exhaustive search over all recorded
data, and the requirement of the time derivatives of all plant
states for calculating prediction errors is stringent.

This paper proposes a novel model reference composite
learning control (MRCLC) strategy for a class of parametric
uncertain affine nonlinear systems. In the composite learning,
a modified modelling error that can utilize online recorded data
is constructed as the prediction error, a second-order command
filter is applied to avoid the need for time derivatives of plant
states, and both the tracking error and the prediction error are
applied to update parametric estimates. It is proven that the
closed-loop system achieves global exponential-like stability under
interval excitation (IE) rather than PE of regression functions.
Consequently, the limitations of concurrent learning are alleviated
by the proposed MRCLC design. The effectiveness and
superiority of this approach are verified by application to an
inverted pendulum control problem in comparison with
some existing adaptive control approaches.

The notations of this paper are relatively standard, where
$\mathbb{R}$, $\mathbb{R}^+$, $\mathbb{R}^n$ and $\mathbb{R}^{n\times m}$ denote the spaces of real numbers,
positive real numbers, real $n$-dimensional vectors, and $n \times m$-dimensional
matrices, respectively, $|\cdot|$ and $\|\cdot\|$ denote the
absolute value and Euclidean norm, respectively, $L_\infty$ denotes
the space of bounded signals, $\Omega_c := \{x \mid \|x\| \le c\}$ denotes the
ball of radius $c$, $\min\{\cdot\}$ and $\max\{\cdot\}$ are the minimum and
maximum functions, respectively, $\mathrm{sgn}(\cdot)$ is the sign function,
$\mathrm{rank}(A)$ is the rank of $A$, $\mathrm{diag}(\cdot)$ is a diagonal matrix, and $C^k$
represents the space of functions whose derivatives up to order $k$ all
exist and are continuous, where $c \in \mathbb{R}^+$, $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{n\times m}$,
and $n$, $m$ and $k$ are positive integers.


II. Problem Formulation 

This section discusses the formulation of learning from the
classical MRAC. For clear illustration, consider a class of $n$th
order affine nonlinear systems as follows [27]:

$$\dot{x} = \Lambda x + b(f(x) + u) \tag{1}$$

where $\Lambda \in \mathbb{R}^{n\times n}$, $b := [0, \cdots, 0, 1]^T \in \mathbb{R}^n$,
$x(t) := [x_1(t), x_2(t), \cdots, x_n(t)]^T \in \mathbb{R}^n$ is the vector of plant states,
$u(t) \in \mathbb{R}$ is the control input, and $f(x) : \mathbb{R}^n \mapsto \mathbb{R}$ is the model uncertainty.
A reference model that characterizes the desired response of
the system (1) is given by

$$\dot{x}_r = A_r x_r + \mathbf{b}_r r \tag{2}$$

with $\mathbf{b}_r := [0, \cdots, 0, b_r]^T \in \mathbb{R}^n$, in which $A_r \in \mathbb{R}^{n\times n}$ is a
strictly Hurwitz matrix, $x_r(t) := [x_{r1}(t), \cdots, x_{rn}(t)]^T \in \mathbb{R}^n$
is the vector of reference model states, and $r(t) \in \mathbb{R}$ is a
bounded reference signal. This study is based on the assumptions that
$x$ is measurable, $(\Lambda, b)$ is controllable, and $f(x)$ is linearly
parameterizable such that [27]-[29]

$$f(x) = W^{*T}\Phi(x) \tag{3}$$

in which $W^* \in \Omega_{c_w} \subset \mathbb{R}^N$ is an unknown constant parameter
vector, $\Phi(x) : \mathbb{R}^n \mapsto \mathbb{R}^N$ is a known regression function
vector, and $c_w \in \mathbb{R}^+$ is a known constant. The following
definitions are given for facilitating control synthesis.

Definition 1 [27]: A bounded signal $\Phi(t) \in \mathbb{R}^N$ is of IE
over $[t_e - \tau_d, t_e]$ if there exist constants $t_e, \tau_d, \sigma \in \mathbb{R}^+$ with
$t_e > \tau_d$ such that $\int_{t_e-\tau_d}^{t_e}\Phi(\tau)\Phi^T(\tau)d\tau \ge \sigma I$ holds.

Definition 2 [27]: A bounded signal $\Phi(t) \in \mathbb{R}^N$ satisfies
the PE condition if there exist constants $\sigma, \tau_d \in \mathbb{R}^+$ such that
$\int_{t-\tau_d}^{t}\Phi(\tau)\Phi^T(\tau)d\tau \ge \sigma I$ holds, $\forall t \ge 0$.

Let $x_{re} := [x_r^T, r]^T$ be an augmented reference signal, and
$\hat W \in \mathbb{R}^N$ be an estimate of $W^*$. Define a tracking error $e(t) :=
x(t) - x_r(t)$, and an estimation error $\tilde W(t) := W^* - \hat W(t)$.
Our objective is to design a proper parametric update law of
MRAC such that both $e$ and $\tilde W$ exponentially converge to 0
under the IE rather than the PE condition.

III. Composite Learning Control Strategy
A. Review of Previous Results

From [?], the MRAC law can be designed as follows:

$$u = \underbrace{-k_e^T e}_{u_{pd}} + \underbrace{k_r^T x_{re}}_{u_{re}} \underbrace{{} - \hat W^T\Phi(x)}_{u_{ad}} \tag{4}$$

where $u_{pd}$ denotes a proportional-derivative (PD) feedback
part, $u_{re}$ denotes a reference signal feedforward part, $u_{ad}$
denotes an adaptive part, $k_e \in \mathbb{R}^n$ and $k_r \in \mathbb{R}^{n+1}$ are control
gains, and the design of $k_r$ satisfies

$$b k_r^T x_{re} = (A_r - \Lambda)x_r + \mathbf{b}_r r. \tag{5}$$

Thus, one obtains the tracking error dynamics

$$\dot e = A e + b\tilde W^T\Phi(x) \tag{6}$$

where $A := \Lambda - b k_e^T$ is designed to be strictly Hurwitz. Thus,
for any matrix $Q = Q^T > 0$, a unique solution $P = P^T > 0$
exists for the following Lyapunov equation:

$$A^T P + P A = -Q. \tag{7}$$
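The step to the tracking error dynamics (6) can be checked in a few lines. The following worked derivation is a sketch under the notation assumed here ($\Lambda$ for the plant matrix in (1), $A := \Lambda - b k_e^T$ for the closed-loop matrix, and $\mathbf{b}_r$ for the input vector of (2)): it substitutes (3) and (4) into (1), subtracts (2), and then applies the matching condition (5).

```latex
\begin{aligned}
\dot e &= \dot x - \dot x_r \\
 &= \Lambda x + b\big(W^{*T}\Phi(x) - k_e^T e + k_r^T x_{re} - \hat W^T\Phi(x)\big) - A_r x_r - \mathbf{b}_r r \\
 &= \Lambda x - b k_e^T e + b\tilde W^T\Phi(x) + (A_r - \Lambda)x_r + \mathbf{b}_r r - A_r x_r - \mathbf{b}_r r \\
 &= (\Lambda - b k_e^T)e + b\tilde W^T\Phi(x) \;=\; A e + b\tilde W^T\Phi(x).
\end{aligned}
```

The reference signal cancels exactly because of (5), so the error is driven only by the parameter mismatch $\tilde W = W^* - \hat W$.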


Let the adaptive law of $\hat W$ be as follows:

$$\dot{\hat W} = \mathcal{P}(\hat W, \gamma e^T P b\Phi(x)) \tag{8}$$

where $\gamma \in \mathbb{R}^+$ is a learning rate, and $\mathcal{P}(\hat W, \bullet)$ is a projection
operator given by [?]

$$\mathcal{P}(\hat W, \bullet) = \begin{cases} \bullet - \hat W\hat W^T\bullet/\|\hat W\|^2, & \text{if } \|\hat W\| \ge c_w \ \&\ \hat W^T\bullet > 0 \\ \bullet, & \text{otherwise.} \end{cases}$$

Choose a Lyapunov function candidate

$$V(z) = e^T P e/2 + \tilde W^T\tilde W/(2\gamma) \tag{9}$$

with $z := [e^T, \tilde W^T]^T \in \mathbb{R}^{n+N}$ for the closed-loop dynamics
composed of (6) and (8). It follows from the classical MRAC
result of [?] that if $\hat W(0) \in \Omega_{c_w}$ and $\Phi(x)$ is of PE, then the
closed-loop system (6) with (8) achieves global exponential
stability in the sense that both the tracking error $e(t)$ and the
estimation error $\tilde W(t)$ converge to 0.
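The effect of the projection operator can be illustrated with a short sketch. It assumes the operator removes the radial component $\hat W\hat W^T\bullet/\|\hat W\|^2$ of the update when $\|\hat W\| \ge c_w$ and $\hat W^T\bullet > 0$, and acts as the identity otherwise; the function name `project` and the list-based vectors are illustrative choices, not part of the paper.

```python
def project(w_hat, v, c_w):
    """Projection of the update direction v for the estimate w_hat.

    If w_hat lies on or outside the ball of radius c_w and v points
    outward (w_hat . v > 0), the radial component of v is removed so
    that the norm of w_hat cannot grow; otherwise v is returned unchanged.
    """
    norm_sq = sum(w * w for w in w_hat)
    outward = sum(w * vi for w, vi in zip(w_hat, v))
    if norm_sq >= c_w ** 2 and outward > 0.0:
        return [vi - w * outward / norm_sq for w, vi in zip(w_hat, v)]
    return list(v)


# Estimate on the boundary (norm 5) with an outward-pointing update.
w_hat = [3.0, 4.0]
v_proj = project(w_hat, [1.0, 1.0], c_w=5.0)
# The projected update is tangential to the ball, so this is ~0.
radial = sum(w * vi for w, vi in zip(w_hat, v_proj))
```

Updates that point inward, or that act on an estimate strictly inside the ball, pass through unchanged, which is why the projection does not interfere with adaptation away from the boundary.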

To remove the requirement of the PE condition on $\Phi(x)$
for parameter convergence in [?], a concurrent learning law
of $\hat W$ is proposed in [27] as follows:

$$\dot{\hat W} = \mathcal{P}\Big(\hat W, \gamma e^T P b\Phi(x) + \gamma\sum_{j=1}^{p}\Phi(x_j)\epsilon_j\Big) \tag{10}$$

where $j$ is a certain epoch, $p \ge N$ is the number of stored
data, and $\epsilon_j := \tilde W_j^T\Phi(x_j)$ denotes a modelling error which is
regarded as the prediction error calculated by

$$\epsilon_j = b^T(\dot x_j - \Lambda x_j - b u_j) - \hat W_j^T\Phi(x_j) \tag{11}$$

where $x_j$, $u_j$ and $\hat W_j$ are online recorded data of $x$, $u$ and $\hat W$ at
the epoch $j$, respectively. Let $Z := [\Phi(x_1), \Phi(x_2), \cdots, \Phi(x_p)]
\in \mathbb{R}^{N\times p}$ be a dynamic data stack. It is shown in the concurrent
learning MRAC approach of [27] that if $\hat W(0) \in \Omega_{c_w}$ and
$\mathrm{rank}(Z) = N$ on $t \ge T_e$, then the closed-loop system (6)
with (10) achieves global exponential-like stability in the sense
that both the tracking error $e$ and the estimation error $\tilde W$
converge to 0 on $t \ge T_e$. Yet in the approach of [27], fixed-point
smoothing must be applied to estimate $\dot x_j$ in (11) such
that the prediction error $\epsilon_j$ is calculable, and singular value
maximization should be applied to exhaustively search the data
stack $Z$ to maximize its singular value.
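The rank condition on the data stack can be checked numerically. The sketch below is only an illustration: it borrows the regressor $\Phi(x) = [\sin x_1, |x_2|x_2, e^{x_1 x_2}]^T$ from the inverted pendulum example of Section IV, the helper names `phi` and `matrix_rank` and the sample points are hypothetical, and plain Gaussian elimination stands in for the singular value maximization used in [27].

```python
import math


def phi(x1, x2):
    """Regressor of the pendulum example: [sin x1, |x2| x2, exp(x1 x2)]."""
    return [math.sin(x1), abs(x2) * x2, math.exp(x1 * x2)]


def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix by Gaussian elimination with pivoting."""
    m = [list(r) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, len(m)), key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            f = m[i][col] / m[rank][col]
            for j in range(col, n_cols):
                m[i][j] -= f * m[rank][j]
        rank += 1
    return rank


# Each row is one recorded regressor Phi(x_j); the rank equals rank(Z).
rich_stack = [phi(0.0, 0.0), phi(math.pi / 2, 0.0), phi(0.0, 1.0)]
poor_stack = [phi(0.0, 0.0)] * 3  # the same data point recorded three times
rank_rich = matrix_rank(rich_stack)
rank_poor = matrix_rank(poor_stack)
```

With $N = 3$, the first stack meets $\mathrm{rank}(Z) = N$, while repeatedly recording the same state does not, however many samples are stored.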


B. Composite Learning Control Design 

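Define a modified modelling error $\epsilon(t) := \Theta_e\tilde W(t)$ as the
prediction error, in which

$$\Theta_e := \begin{cases} \Theta(t) & \text{for } t < T_e \ \text{ if } \Theta(t) < \sigma I \\ \Theta(T_e) & \text{for } t \ge T_e \ \text{ if } \Theta(T_e) \ge \sigma I \end{cases} \tag{12}$$

with $\sigma = \max_{t\ge 0}\{\sigma_r(t)\}$ and

$$\Theta(t) := \int_{t-\tau_d}^{t}\Phi(x(\tau))\Phi^T(x(\tau))d\tau \tag{13}$$

in which $\tau_d \in \mathbb{R}^+$ is an integral duration, $\sigma_r(t) \in \mathbb{R}^+$ is the
minimal singular value of $\Theta(t)$, and $T_e > \tau_d$ is the time when
$\sigma_r(t)$ reaches its maximum*. Then, a composite learning law
of $\hat W$ is designed as follows:

$$\dot{\hat W} = \gamma\,\mathrm{proj}(e^T P b\Phi(x) + k_w\epsilon) \tag{14}$$

in which $k_w \in \mathbb{R}^+$ is a weight factor.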
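The moving-window excitation matrix and its minimal eigenvalue can be explored with a small numerical sketch. Several choices here are assumptions made for illustration only: a two-dimensional regressor (so the minimal eigenvalue has a closed form), a rectangle-rule approximation of the integral, a test signal that rotates for 2 s and then stays constant, and a causal rule that keeps the window matrix with the largest minimal eigenvalue seen so far as a stand-in for $\Theta_e$. The names `min_eig_sym2`, `run_window` and `phi_demo` are hypothetical.

```python
import math
from collections import deque


def min_eig_sym2(m):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    return 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b * b)


def run_window(phi, t_end, dt, tau_d):
    """Rectangle-rule approximation of the windowed integral of phi phi^T.

    Returns (sigma_hist, theta_e, sigma_e): the minimal eigenvalue of the
    window matrix at every step, plus the window matrix with the largest
    minimal eigenvalue seen so far and that eigenvalue.
    """
    n_win = int(round(tau_d / dt))
    window = deque()
    theta = [[0.0, 0.0], [0.0, 0.0]]
    theta_e, sigma_e = [row[:] for row in theta], 0.0
    sigma_hist = []
    for k in range(int(round(t_end / dt))):
        p = phi(k * dt)
        outer = [[p[i] * p[j] * dt for j in range(2)] for i in range(2)]
        window.append(outer)
        for i in range(2):
            for j in range(2):
                theta[i][j] += outer[i][j]
        if len(window) > n_win:  # drop the sample that left the window
            old = window.popleft()
            for i in range(2):
                for j in range(2):
                    theta[i][j] -= old[i][j]
        sigma = min_eig_sym2(theta)
        sigma_hist.append(sigma)
        if sigma > sigma_e:  # keep the most exciting window so far
            theta_e, sigma_e = [row[:] for row in theta], sigma
    return sigma_hist, theta_e, sigma_e


def phi_demo(t):
    """Regressor that is exciting for t < 2 s and constant afterwards."""
    return [math.cos(t), math.sin(t)] if t < 2.0 else [1.0, 0.0]


sigma_hist, theta_e, sigma_e = run_window(phi_demo, t_end=5.0, dt=0.001, tau_d=1.0)
```

In this run the minimal eigenvalue of the current window falls back to essentially zero once the regressor stops moving, so PE fails, while the stored matrix keeps a strictly positive minimal eigenvalue, which is the situation that interval excitation is meant to capture.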

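From $\epsilon = \Theta_e\tilde W$ and (12), the calculation of $\epsilon$ can follow
the calculation of $\Theta\tilde W$ as follows:

$$\Theta(t)\tilde W(t) = \Theta(t)W^* - \Theta(t)\hat W(t) \tag{15}$$

where $\Theta\hat W$ is directly obtainable by (13) and (14), and $\Theta W^*$
is not directly obtainable. Multiplying both sides of the last row of
(6), i.e. $\dot e_n = b^T A e + \tilde W^T\Phi(x)$, by $\Phi(x(t))$, integrating the
resulting equality over $[t - \tau_d, t]$ and making some transformations,
one gets

$$\Theta(t)W^* = \int_{t-\tau_d}^{t}\Phi(x)\big(\dot e_n + \hat W^T\Phi(x) - b^T A e\big)d\tau \tag{16}$$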


in which the time variable $\tau$ is omitted in the above integral
part. Since $\dot e_n$ is unavailable, a second-order linear filter with
unit gain is implemented as follows [?]:

$$\begin{cases} \dot{\hat e}_n = \hat e_{n+1} \\ \dot{\hat e}_{n+1} = -2\zeta\omega\hat e_{n+1} + \omega^2(e_n - \hat e_n) \end{cases} \tag{17}$$

with $\hat e_n(0) = e_n(0)$ and $\hat e_{n+1}(0) = 0$, where $\omega \in \mathbb{R}^+$ is the
natural frequency, $\zeta \in \mathbb{R}^+$ is the damping factor, and $\hat e_n$ and
$\hat e_{n+1}$ are estimates of $e_n$ and $\dot e_n$, respectively. The integral in
(16) effectively reduces the influence of measurement noise
on the calculation of $\Theta W^*$. Thus, $\omega$ in (17) can be made
sufficiently large so that $\hat e_{n+1} \approx \dot e_n$. For simplifying analysis,
assume that $\hat e_{n+1} = \dot e_n$ as in [?]. The reasonability of this
assumption will also be verified in the subsequent simulations.
The following theorem shows the main result of this study.

Theorem 1: Consider the system (1) driven by the control
law (4) with (14), where the control gain $k_r$ is designed to
satisfy (5), and the control gain $k_e$ is selected to make $A$ in (6)
strictly Hurwitz. If $\hat W(0) \in \Omega_{c_w}$ and $\Theta(T_e) \ge \sigma I$ are satisfied
with $c_w, \sigma \in \mathbb{R}^+$ and $T_e > \tau_d$, then the closed-loop system
composed of (6) and (14) achieves global exponential-like
stability in the sense that all closed-loop signals are bounded
for all $t \ge 0$ and both the tracking error $e(t)$ and the estimation
error $\tilde W(t)$ exponentially converge to 0 on $t \ge T_e$.

Proof: Firstly, consider the control problem at $t \in [0, \infty)$.
Choose the Lyapunov function candidate $V$ in (9) for the
closed-loop system constituted by (6) and (14). The time
derivative of $V$ along (6) is as follows:

$$\dot V = -e^T Q e/2 + \tilde W^T\big(e^T P b\Phi(x) - \dot{\hat W}/\gamma\big) \tag{18}$$

where (7) is utilized to obtain (18). Applying (14) to (18),
noting $\hat W(0) \in \Omega_{c_w}$ and $\epsilon = \Theta_e\tilde W$, and using the projection
operator results in [?], one obtains

$$\dot V \le -e^T Q e/2 - k_w\tilde W^T\Theta_e\tilde W, \ \forall t \ge 0. \tag{19}$$

Noting $Q > 0$ and $\Theta_e \ge 0$, one gets $\dot V(t) \le 0$, $\forall t \ge 0$, which
implies that the closed-loop system is stable in the sense that
$e, \tilde W \in L_\infty$. Since $\dot V(t) \le 0$ is satisfied $\forall x(0) \in \mathbb{R}^n$ and
$V$ in (9) is radially unbounded (i.e. $V(z) \to \infty$ as $\|z\| \to \infty$),
the stability is global. Using $e, \tilde W \in L_\infty$, one also obtains
$x, \hat W, \epsilon, u \in L_\infty$ from their definitions. Thus, all closed-loop
signals are bounded for all $t \ge 0$.

Secondly, consider the control problem at $t \in [T_e, \infty)$.
Since there exist $\sigma \in \mathbb{R}^+$ and $T_e > \tau_d$ such that $\Theta(T_e) \ge \sigma I$,
i.e. the bounded signal $\Phi(x(t))$ is exciting over $t \in [T_e - \tau_d, T_e]$,
it is obtained from (19) that

$$\dot V \le -e^T Q e/2 - k_w\sigma\tilde W^T\tilde W, \ \forall t \ge T_e. \tag{20}$$

It follows from (9) and (20) that

$$\dot V(t) \le -k_s V(t), \ \forall t \ge T_e \tag{21}$$

where $k_s := \min\{\lambda_{\min}(Q)/\lambda_{\max}(P), 2\gamma k_w\sigma\} \in \mathbb{R}^+$, which
implies that the closed-loop system has global exponential-like
stability in the sense that both $e$ and $\tilde W$ exponentially converge
to 0 as long as $t \ge T_e$. □

* The presence of $T_e$ and $\sigma$ here is only used in the subsequent performance
analysis, and their values can only be obtained during the control process.

IV. Application to Inverted Pendulum

Consider the following inverted pendulum model [27]:

$$\dot x = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix}\big(W^{*T}\Phi(x) + u\big)$$

in which $W^* = [1, -1, 0.5]^T$ and $\Phi(x) = [\sin x_1, |x_2|x_2, e^{x_1 x_2}]^T$.
The reference model is given as follows:

$$\dot x_r = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix} x_r + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r.$$
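As a consistency check on the reference model, the matching condition (5) can be worked out for this example. The sketch assumes $\Lambda = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $b = [0, 1]^T$ from the pendulum model, together with the gain $k_r = [-1, -2, 1]^T$ reported in the parameter selection of this section, and recovers the reference model matrices from them.

```latex
\begin{aligned}
b k_r^T x_{re} &= \begin{bmatrix} 0 \\ 1 \end{bmatrix}(-x_{r1} - 2x_{r2} + r)
 = \begin{bmatrix} 0 & 0 \\ -1 & -2 \end{bmatrix} x_r + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r
 = (A_r - \Lambda)x_r + \mathbf{b}_r r, \\
A_r &= \Lambda + \begin{bmatrix} 0 & 0 \\ -1 & -2 \end{bmatrix}
 = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}, \qquad
\mathbf{b}_r = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\end{aligned}
```

Under these assumptions the characteristic polynomial of $A_r$ is $s^2 + 2s + 1$, so both eigenvalues sit at $-1$ and $A_r$ is strictly Hurwitz, as the formulation requires.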


For simulations, set $x(0) = x_r(0) = [1, 1]^T$, $r(t) = 1$ while
$t \in [20, 25)$, and $r(t) = 0$ while $t \in [0, 20) \cup [25, \infty)$ [27].
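The reference trajectory implied by these settings can be previewed with a minimal sketch. It assumes the reference model $\ddot x_{r1} + 2\dot x_{r1} + x_{r1} = r$, which is what the matching condition (5) gives for $k_r = [-1, -2, 1]^T$, and uses forward Euler with a 0.001 s step rather than the ode5 solver of the paper; the function names are hypothetical.

```python
def r(t):
    """Reference signal: a unit pulse on [20, 25) s, zero elsewhere."""
    return 1.0 if 20.0 <= t < 25.0 else 0.0


def simulate_reference(t_end, dt=0.001):
    """Forward-Euler rollout of the assumed reference model from [1, 1].

    Returns the reference state (x_r1, x_r2) reached at time t_end.
    """
    xr1, xr2 = 1.0, 1.0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        dx1 = xr2
        dx2 = -xr1 - 2.0 * xr2 + r(t)
        xr1 += dt * dx1
        xr2 += dt * dx2
    return xr1, xr2


x1_at_20, _ = simulate_reference(20.0)  # just before the pulse starts
x1_at_25, _ = simulate_reference(25.0)  # at the end of the pulse
```

The initial condition has decayed almost completely by 20 s, and the 5 s pulse then drives $x_{r1}$ most of the way to 1, so the plant is excited only during the initial transient and around the pulse.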

The parameter selection of the proposed control law (4)
with (14) is based on that of [27], where the details are given
as follows: firstly, solve (5) to obtain $k_r = [-1, -2, 1]^T$;
secondly, select $k_e = [1.5, 1.3]^T$ so that $A$ is strictly Hurwitz;
thirdly, solve (7) with $Q = 10I$ to obtain $P$; fourthly, set $\tau_d
= 5$ s in (13); fifthly, set $\gamma = 3.5$, $k_w = 6$ and $c_w = 5$ in (14);
finally, set $\omega = 100$ and $\zeta = 0.7$ in (17).
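The Lyapunov step of this procedure can be reproduced without a numerical library. The sketch below assumes $A = \Lambda - b k_e^T = \begin{bmatrix} 0 & 1 \\ -1.5 & -1.3 \end{bmatrix}$ for the pendulum example and solves $A^T P + P A = -qI$ in closed form for this companion structure. The helper names are hypothetical, and the values of $\gamma$, $k_w$ and $\sigma$ passed to `decay_rate` are placeholders chosen only to exercise the formula for $k_s$ in (21).

```python
import math


def lyapunov_companion_2d(a1, a2, q):
    """Solve A^T P + P A = -q*I for A = [[0, 1], [-a1, -a2]], a1, a2 > 0."""
    p12 = q / (2.0 * a1)
    p22 = (p12 + q / 2.0) / a2
    p11 = a2 * p12 + a1 * p22
    return [[p11, p12], [p12, p22]]


def residual(a, p, q):
    """Largest absolute entry of A^T P + P A + q*I (zero for a solution)."""
    worst = 0.0
    for i in range(2):
        for j in range(2):
            s = sum(a[k][i] * p[k][j] + p[i][k] * a[k][j] for k in range(2))
            worst = max(worst, abs(s + (q if i == j else 0.0)))
    return worst


def max_eig_sym2(m):
    """Largest eigenvalue of a symmetric 2x2 matrix."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    return 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)


def decay_rate(q, p, gamma, k_w, sigma):
    """k_s = min{lambda_min(Q)/lambda_max(P), 2*gamma*k_w*sigma} for Q = q*I."""
    return min(q / max_eig_sym2(p), 2.0 * gamma * k_w * sigma)


A = [[0.0, 1.0], [-1.5, -1.3]]  # Lambda - b k_e^T with k_e = [1.5, 1.3]
P = lyapunov_companion_2d(1.5, 1.3, q=10.0)
tracking_rate = 10.0 / max_eig_sym2(P)  # first term of k_s
k_s_demo = decay_rate(10.0, P, gamma=3.5, k_w=1.0, sigma=0.01)  # placeholder values
```

For weak excitation the second term $2\gamma k_w\sigma$ is the smaller one, so the guaranteed rate in (21) is then limited by the excitation level rather than by the tracking loop.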

Simulations are carried out in MATLAB software running
on Windows 7, where the solver is set to be fixed-step ode5
with the step size being 0.001 s and other settings being
default values. The conventional MRAC in [?] and the model
reference CAC (MRCAC) with Q-modification in [?] are
selected as baseline controllers, where the reference models
and shared parameters of all controllers are set to be the same
for fair comparisons. Simulation trajectories by all controllers
are depicted in Fig. 1. For the control performance, it is shown
that the plant state $x$ follows its desired signal $x_r$ closely
with a smooth control input $u$ for all controllers, the MRCAC
achieves the worst tracking accuracy [see Fig. 1(c)], and the
proposed MRCLC achieves the best tracking accuracy [see
Fig. 1(e)]. For the learning performance, it is observed that IE
occurs with $\sigma$ rising at $t = 5$ s in this case, the MRAC does not
show any parameter convergence [see Fig. 1(b)], the MRCAC
shows better parameter estimation than the MRAC but still
does not achieve parameter convergence [see Fig. 1(d)], and
the proposed MRCLC achieves fast and accurate parameter
estimation even though the IE is short and weak [see Fig. 1(f)].
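The derivative estimate that the simulations rely on can be probed in isolation. The sketch below assumes the second-order filter has the standard unit-gain form $\dot{\hat e}_n = \hat e_{n+1}$, $\dot{\hat e}_{n+1} = -2\zeta\omega\hat e_{n+1} + \omega^2(e_n - \hat e_n)$, feeds it the test signal $e_n(t) = \sin t$ (an arbitrary choice, not a signal from the paper), and integrates with forward Euler at the 0.001 s step size; the function name is hypothetical.

```python
import math


def filter_derivative_error(omega=100.0, zeta=0.7, dt=0.001, t_end=5.0):
    """Forward-Euler run of the assumed filter on e_n(t) = sin(t).

    Returns the largest |e_hat_{n+1} - cos(t)| observed for t >= 1 s,
    i.e. after the initial transient of a fast filter has decayed.
    """
    e_hat, de_hat = 0.0, 0.0  # e_hat(0) = e_n(0) = sin(0), e_hat_{n+1}(0) = 0
    worst = 0.0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        if t >= 1.0:
            worst = max(worst, abs(de_hat - math.cos(t)))
        e_n = math.sin(t)
        d_e_hat = de_hat
        d_de_hat = -2.0 * zeta * omega * de_hat + omega ** 2 * (e_n - e_hat)
        e_hat += dt * d_e_hat
        de_hat += dt * d_de_hat
    return worst


err_fast = filter_derivative_error(omega=100.0, zeta=0.7)
err_slow = filter_derivative_error(omega=1.0, zeta=0.7)
```

With $\omega = 100$ and $\zeta = 0.7$ the derivative estimate stays within a few percent of the true derivative of this slow test signal, whereas a filter with $\omega = 1$ lags badly, which is consistent with choosing a large natural frequency when the window integral already attenuates noise.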












[Figure 1: six panels (a)-(f); recoverable axis labels are "State tracking $x_r$, $x$" and "Control input $u$".]

Fig. 1. Simulation trajectories by all controllers. (a) Control performance by the MRAC. (b) Learning performance by the MRAC. (c) Control performance by
the MRCAC. (d) Learning performance by the MRCAC. (e) Control performance by the MRCLC. (f) Learning performance by the MRCLC.

V. Conclusion 

In this paper, an MRCLC strategy has been successfully
developed for a class of parametric uncertain affine nonlinear
systems such that parameter convergence can be guaranteed by
the IE rather than the PE condition. The proposed approach has
also been applied to an inverted pendulum model, where superior
control and learning performances have been demonstrated
compared with the conventional MRAC and the MRCAC with
Q-modification. Further work would focus on the extension of
the composite learning to wider classes of nonlinear systems
such as multi-input multi-output affine nonlinear systems [?]
and strict-feedback nonlinear systems [?].